Emergent Criticality Through Adaptive Information Processing in Boolean Networks 
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We study information processing in populations of Boolean networks with evolving connectivity 
and systematically explore the interplay between the learning capability, robustness, the network 
topology, and the task complexity. We solve a long-standing open question and find computationally 
that, for large system sizes A*', adaptive information processing drives the networks to a critical 
connectivity Kc ~ 2. For finite size networks, the connectivity approaches the critical value with a 
power-law of the system size A''. We show that network learning and generalization are optimized 
near criticality, given task complexity and the amount of information provided surpass threshold 
values. Both random and evolved networks exhibit max;imal topological diversity near Kc- We 
hypothesize that this diversity supports efficient exploration and robustness of solutions. Also 
reflected in our observation is that the variance of the fitness values is maximal in critical network 
populations. Finally, we discuss implications of our results for determining the optimal topology of 
adaptive dynamical networks that solve computational tasks. 
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In 1948, Alan Turing proposed several unorganized 
machines made up from randomly interconnected two- 
input NAND logic gates [l[ as a biologically plausible 
model for computing. He also proposed to train such net- 
works by means of a "genetical or evolutionary search." 
Much later, random Boolean networks (RBN) were intro- 
duced as simplified models of gene regulation , focus- 
ing on a system- wide perspective rather than on the often 
unknown details of regulatory interactions 0|- In the 
thermodynamic limit, these disordered dynamical sys- 
tems exhibit a dynamical order-disorder transition at a 
sparse critical connectivity Kc @| • For a finite system size 
N, the dynamics of RBNs converge to periodic attractors 
after a finite number of updates. At K^, the phase space 
structure in terms of attractor periods [6| , the number of 
different attractors [7] and the distribution of basins of 
attraction |15j is complex, showing many properties rem- 
iniscent of biological networks Q . In cellular automata 
(CA) , complex computation has been hypothesized to oc- 
cur where the rules show complex dynamics at "the edge 
of chaos" This claim was refuted in [T^]. However, 

the argument in [lo| rests on symmetric spaces in the 
CA lattice and rule space. These results therefore do not 
apply to RBN. Phase transition in information dynamics 
was studied in 11 ll. State-topology coevolution in RBNs 
was studied by |12h14 and it was shown that networks 
evolved toward a critical connectivity Kc = 2. This let- 
ter presents the first study to link complex dynamics, 
topology, and task solving in an open RBN. 



In [TBLn^ simulated annealing (SA) and genetic algo- 
rithms (GAs) were used to train feedforward RBNs and 
to study the thermodynamics of learning. For a given 
task with predefined input-output mappings, only a frac- 
tion of the input space is required to train networks that 
generalize perfectly on all input patterns. This fraction 
depends on the network size and the task complexity. 
Moreover, the more inputs a task has, the smaller the 
training set needs to be to obtain full generalization. In 
this context, learning refers to correctly solving the task 
for the training samples while generalization refers to cor- 
rectly solving the task for novel inputs. We use adapta- 
tion to refer to the phase where networks have to adapt 
to ongoing mutations (i.e., noise and fluctuations), but 
have already learned the input-output mapping. In this 
Letter, we study adaptive information processing in pop- 
ulations of Boolean networks with an evolving topology. 
Rewiring of connections and mutations of the functions 
occur at random, without bias toward particular topolo- 
gies (e.g., feedforward). We systematically explore the 
interplay between the learning capability, the network 
topology, the system size N, the training sample T, and 
the complexity of the computational task. 

First, let us define the dynamics of RBNs. A RBN is 
a discrete dynamical system composed of N automata. 
Each automaton is a Boolean variable with two possi- 
ble states: {0, 1}, and the dynamics is such that F : 
{0,1}^ ^ {0,1}^, where = (/i, ...,/„..., /at), and 
each fi is represented by a look-up table of Ki inputs 
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randomly chosen from the set of A'' automata. Initially, 
Ki neighbors and a look-table are assigned to each au- 
tomaton at random. For practical reasons we restrict 
the maximum K.^ to 8. An automaton state aj G {0, 1} 
is updated using its corresponding Boolean function, 
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The automata are updated synchronously using their 
corresponding Boolean functions. For the purpose of 
solving computational tasks, we define I inputs and O 
outputs. The inputs of the computational task are ran- 
domly connected to an arbitrary number of automata. 
The connections from the inputs to the automata are 
subject to rewiring and are counted to determine the av- 
erage network connectivity {K). The outputs are read 
from a randomly chosen but fixed set of O automata. 
All automata states are initialized to "0" for each input 
pattern before simulating the network. 

Methodology. — We evolve the networks by means of 
a traditional genetic algorithm (GA) to solve three com- 
putational tasks of varying difficulty, each of which de- 
fined on a 3-bit input: full-adder (FA), even-odd (EO), 
and the cellular automata rule 85 (R85) The FA 

task receives two binary inputs A, B, an input carry 
bit Cin, and outputs the binary sum of the three in- 
puts S = A + i? + Cin on the 2-bit output and the 
carry bit Cout- The EO task outputs a 1 if there is 
an odd number of Is in the input (independent of the 
order), a otherwise. R85 is defined for three binary 
inputs A, i?, and C, and outputs the negation of C. 
The output for R85 task therefore only depends on one 
input bit. The EO task represents the most difficult 
task, followed by the FA and R85 task. Task difficulty 
is the complexity of information integration needed in 
the input to determine the output. This can be mea- 
sured through information-theoretical decomposability of 
a task. We can represent the task itself as the contin- 
gency table of its inputs and outputs. Different decom- 
position models of the task are the different ways that we 
can calculate the marginal probabilities from the origi- 



nal contingency table [21|. We calculate a weighted sum 
of the vector of the information content of all possible 
decomposition models of a task F . This can be sum- 



marized in: decomposition^ = ^ 
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where Modelsp is the set of all decomposition models 
of F, and the weight w„i of a model is proportional to its 
degrees of freedom. The information content of a model 
is calculated using /n/,„ = 1 — ^"^'^^^ . Here, Hm is 
the entropy of the model. Hp is the entropy of F, and 
Hind is the entropy of the independence model (all input 
and output variables are assumed independent). Higher 
values for decomposition p mean that the task is more 
decomposable and therefore less difficult. 

The genetic algorithm we use is mutation-based only, 
i.e., no cross-over operation is applied. For all experi- 
ments we ran a population of 30 networks with initial 
connectivity (Kin) = 1 and a mutation rate of 0.8. Each 



mutation is decomposed into 1 + a steps repeated with 
probability p{a) = 0.5"+^, where a > 0. Each step in- 
volves hipping a random location of the look-up table of 
a random automata combined with adding or deleting 
one link. Each population is run for 30, 000 generations. 
We repeat each evolutionary run 30 times and average the 
results. In each generation and for each tested input con- 
figuration, the RBN is run for a convergence time t (x N 
updates. Afterward, we run the network for an addi- 
tional t (X N updates to record the activity of the output 
nodes. If the activity of an output node is "1" for at least 
half of the t time steps, we interpret the output as a "1" , 
and as "0" otherwise. For an evolutionary run of training 
size T, the training sample set M is randomly chosen at 
each generation without replacement from the 2^ possi- 
ble input patterns. During each generation, the fitness 
of each individual is determined by / = 1 — Em, where 
E]\i is the normalized average error over the T random 

training samples: Em = y Y^i&M Y^j^o ~ "y)^- °y 
is the value of the output automata j for the input pat- 
tern i, and Oij is the correct value of the same bit for 
the corresponding task. The generalization score is cal- 
culated using the same equation with M including all 2^ 
inputs rather than a random sample. Finally, selection 
is applied to the population as a deterministic tourna- 
ment. Two individuals are picked randomly from the old 
population and their fitness values are compared. The 
better individual is mutated and inserted into the new 
population, the worse individual is discarded. We repeat 
the process until we have 30 new individuals in the new 
population. 

Results. — We observe a convergence of (K) close to 
the critical value Kc = 2 for large system sizes N and 
training sample sizes larger or equal to T = 4. For 
T = 8, populations always evolve close to criticality for 
moderate N. For smaller T, the average over all evo- 
lutionary runs is found at slightly higher values of (K) 
(Fig. [1]). If the average is taken only over the best indi- 
viduals, however, {K) values close to Kc are recovered. 
This observation can be explained from the fact that for 
T < 8, due to the limited information provided for learn- 
ing, some populations cannot escape local optima, and 
hence do not reach maximum fitness. Sub-optimal net- 
work populations show a large scatter in (K) values in 
the evolutionary steady state, while those with high fit- 
ness scores cluster around Kc = 2 (Fig. [U inset). For 
the simple R85 task we do not observe any convergence 
to Kc — 2, independent of the training samples. For the 
other tasks, the finite size scaling of (K) (Fig.[2|) exhibits 
convergence towards Kc with a power-law as a function 
of the system size N. For T — 8, the exponent b of the 
power-law for the three tasks EO, FA, and R85 is —1.63, 
— 1.11, and —0.30 respectively (Fig.[2|). Altogether, these 
results suggest that the amount of information provided 
by the input training sample helps to drive the network 
to a critical connectivity. 
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FIG. 1. Convergence of the average network connectivity as 
a function of the GA generations tg. FA task with T = 4 and 
N = 100. The curves are averaged over 30 evolutionary runs 
(red), only the 22 best (green), and the 15 best (hght blue) 
populations, respectively. Inset: scatter plot correlating aver- 
age {K){tg) and average generalization {G){tg) of a successful 
population (black) and a suboptimal population (purple). 
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FIG. 2. Finite size scaling of (K) as a function of A'^ for the 
three tasks, EO (black), FA (blue), R85 (magenta), and the 
training sample size T = 4 (a) and T = 8 (b). Points represent 
the data of the evolved networks, lines represent the fits. The 
finite size scaling for {K) shows that it scales with a power-law 
as a function of the system size A^. The dashed lines represent 
the power-law fit of the form a * a;*" + c. We favor the data for 
larger N by weighting the data according to N/Nmax, where 
Nmax = 500. The insets show — c as a function of A^ on a 
log-log scale. 



Interestingly, the population dynamics in our model 
follow Fisher's fundamental theorem of natural selection, 
which attributes the rate of increase in the mean fitness 
to the increased fitness variance in the population [j^ . 
It has been shown in GAs that the diversity maximiza- 
tion makes more configurations of the search space 
accessible to the genetic search to find optimal solutions 

IMl- 

Indeed, we find that the standard deviation of the fit- 
ness values in the populations has a local maximum near 
Kc (Fig.m inset), with a sharp decay toward larger (if), 
indicative of maximum diversity near criticality. Evi- 
dently, this diversity helps to maintain a high fitness 
population in the face of continuous mutations with a 
fairly high rate (0.8 in our study). While the average 
fitness can be lower (and often is), compared to less di- 
verse populations, the probability to find and maintain 




FIG. 3. Near K^, the topology of the network shows maxi- 
mal variance. The insets show the standard deviation of the 
topological measures for initial ER networks (magenta), the 
evolved networks (black), and the XRG (blue). The solid lines 
represent the topological measures in random networks. The 
dotted line represents the same measure in RBNs. 



high fitness solutions is strongly increased. Indeed, we 
find that populations where the best mutant has maxi- 
mum fitness (/ = 1) sharply peak near Kc (Fig. SJ, as 
well as populations where the best mutant reaches perfect 
generalization. To find a possible source of fitness diver- 
sity, we determined several topological measures of the 
networks [3]: the eccentricity (maximum shortest path 
between a vertex v and any other vertex in a graph) , the 
betweenness centrality (the average fraction of shortest 
paths between all vertices in a graph that passes through 
a vertex w), the participation, and the characteristic path 
length. These measures were calculated for Erdos-Renyi 
(ER), exponential Random Graphs (XRG), as well as 
for the evolved networks (Fig. [3]). In fact, we find that 
the graph-theoretical measures have maximal variance 
near Kc = 2. Similarly, other authors have shown that 
dynamical diversity is maximized near Kc^ too ^25] . Our 
results suggest that evolving RBN can indeed exploit this 
diversity to optimize learning. 

In addition, we find that during the learning process of 
the networks, the in-degree distribution changes from a 
Poissonian to an exponential distribution. In particular, 
we observe that the topological properties of the networks 
reach a compromise between ER graphs and the XRG. 
The same observation was made in input- and output- 
less RBNs that were driven to criticality by using a local 
rewiring rule [13] • This significant topology change is 
related to diversity (entropy) maximization during the 
learning phase [2^. However, this is beyond the scope of 
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FIG. 4. The conditional probability that evolving popula- 
tions, where the best mutant reaches maximum fitness (i.e., 
fbest = fmax = 1), havc average connectivity (K) shows a 
sharp peak near Kc (black curve), the same is found for max- 
imum generalization (light blue). Inset: diversity of evolving 
populations, quantified in terms of the standard deviation 
a"(/) of fitness distributions, has a maximum near Kc = 2. 
All data sampled over the best 22 out of 30 populations for 
full-adder task with T = 4 and iV = 100. 



this paper and will be discussed in a separate publication. 

Finally, we measured the damage spreading in the 
evolved RBNs [IJ to determine their dynamical regime. 
The damage spreading dt+i is measured by changing the 
state of a randomly selected node in two identical net- 
works. The two networks are simulated for a single time 
step and the damage spreading d is then calculated by 
averaging the ratio over many trails with random 
initial network configurations. One observes that for crit- 
ical networks d = I, for supercritical networks d > 1 , and 
for subcritical networks d < 1. We see that for networks 
with a high fitness, d peaks around 1 for all N. 

Discussion. — We investigated the learning and gen- 
eralization capabilities in RBNs and showed that they 
evolve toward a critical connectivity of « 2 for large 
networks and large input sample sizes. For finite size 
networks, the connectivity approaches the critical value 
with a power-law of the system size N. We showed that 
network learning and generalization are optimized near 
criticality, given task complexity and the amount of in- 
formation provided surpass threshold values. Further- 
more, critical REN populations exhibit the largest diver- 
sity (variance) in fitness values, which supports learn- 
ing and robustness of solutions under continuous mu- 
tations. By considering graph-theoretical measures, we 
determined that corresponds to a region in network 
ensemble space where the topological diversity is maxi- 
mized, which may explain the observed diversity in crit- 
ical populations. 

Interestingly, we observe that RBN populations that 
are optimal with respect to learning and generalization 
tend to show average connectivity values slightly below 
(K) — 2. This may be related to previous results indi- 



the evolution, the attractor landscape changes so that 
there are enough attractors to properly process the in- 
puts. The entire task is encoded as a hyper cycle (i.e., 
a set of mutually reachable attractors) in the network 
dynamics. The input combinations play the role of a 
multi- valued switch that pushes the dynamics out of one 
attractor into the next along the hyper cycle. Emergence 
of the large attractor basins make the computation highly 
robust to perturbations in the node state while maintain- 
ing sensitivity to input signals. All networks in our final 
population converge to fixed-point or cyclic attractors. 

To summarize, we solved a long-standing question and 
showed that learning of classification tasks and adapta- 
tion can drive RBNs to the "edge of chaos" [3], where 
high-diversity populations are maintained and on-going 
adaptation and robustness are optimized. Our study may 
have important implications for determining the optimal 
topology of a much larger class of complex dynamical 
networks where adaptive information processing needs 
to be achieved efficiently, robustly, and with limited con- 
nectivity (i.e., resources). This has applications, e.g., 
in the area of neural networks, complex networks, and 
more specifically in the area of emerging molecular and 
nanoscale networks and computing devices, which are ex- 
pected to be built in a bottom-up way from vast numbers 
of simple, densely arranged components that exhibit high 
failure rates, are relatively slow, and connected in an un- 
structured way. 
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eating that Kc < 2 in finite size RBN 

Examination of the attractors of the final population 
confirms that the computation happens as partitioning 
of the state-space into disjoint attractors [28|. During 
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